AITopics | propensity score function


Joint empirical risk minimization for instance-dependent positive-unlabeled data

Rejchel, Wojciech, Teisseyre, Paweł, Mielniczuk, Jan

arXiv.org Machine Learning, Dec-27-2023

Learning from positive and unlabeled data (PU learning) is an actively researched machine learning task. The goal is to train a binary classification model on a training dataset that contains a labeled subset of the positives together with unlabeled instances. The unlabeled set comprises the remaining positives and all negative observations. An important element of PU learning is modeling the labeling mechanism, i.e., the assignment of labels to positive observations. Unlike many prior works, we consider a realistic setting in which the probability of label assignment, i.e., the propensity score, is instance-dependent. In our approach, we investigate the minimiser of an empirical counterpart of a joint risk that depends on both the posterior probability of inclusion in the positive class and the propensity score. The non-convex empirical risk is alternately optimised with respect to the parameters of both functions. In the theoretical analysis, we establish risk consistency of the minimisers using recently derived methods from the theory of empirical processes. In addition, an important development is a novel implementation of the optimisation algorithm, for which sequential approximation of the set of positive observations among the unlabeled ones is crucial. This relies on a modified 'spies' technique as well as on a thresholding rule based on conditional probabilities. Experiments conducted on 20 data sets for various labeling scenarios show that the proposed method performs on par with or better than state-of-the-art methods based on propensity function estimation.
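The idea of alternately optimising a joint risk over two parameter blocks can be illustrated with a minimal sketch. It assumes both the posterior and the propensity score are logistic in the features, so that the probability of being labeled is their product, and it uses plain gradient steps on simulated data. The function names, step size, and simulation settings are hypothetical, and the sketch omits the 'spies' technique and thresholding rule described in the abstract, so it is not the authors' algorithm.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def joint_risk(X, s, a, b):
    """Empirical logistic risk for the label indicator s,
    assuming P(s=1 | x) = posterior(x) * propensity(x)."""
    p = np.clip(sigmoid(X @ a) * sigmoid(X @ b), 1e-12, 1 - 1e-12)
    return -np.mean(s * np.log(p) + (1 - s) * np.log(1 - p))


def block_gradient(X, s, own, other):
    """Gradient of joint_risk w.r.t. one parameter block (own),
    with the other block held fixed. The risk is symmetric in the blocks."""
    f_own, f_other = sigmoid(X @ own), sigmoid(X @ other)
    p = np.clip(f_own * f_other, 1e-12, 1 - 1e-12)
    return -X.T @ ((1 - f_own) * (s - p) / (1 - p)) / len(s)


def alternating_minimisation(X, s, n_rounds=200, lr=0.1):
    """Alternate gradient steps on posterior parameters a and propensity parameters b."""
    a = np.zeros(X.shape[1])
    b = np.zeros(X.shape[1])
    for _ in range(n_rounds):
        a = a - lr * block_gradient(X, s, a, b)  # update posterior block
        b = b - lr * block_gradient(X, s, b, a)  # update propensity block
    return a, b


def simulate_pu(n=2000, seed=0):
    """Simulated PU data: only positives can be labeled, with instance-dependent propensity."""
    rng = np.random.default_rng(seed)
    X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
    y = rng.binomial(1, sigmoid(X @ np.array([0.5, 1.5, -1.0])))  # true class (unobserved)
    e = sigmoid(X @ np.array([0.0, -1.0, 0.5]))                   # propensity score
    s = y * rng.binomial(1, e)                                    # observed label indicator
    return X, s.astype(float)


X, s = simulate_pu()
a_hat, b_hat = alternating_minimisation(X, s)
print("risk at zero parameters:", joint_risk(X, s, np.zeros(3), np.zeros(3)))
print("risk after alternation: ", joint_risk(X, s, a_hat, b_hat))
```

Because the risk is non-convex and symmetric in the two blocks, a low risk value alone does not tell which block plays the role of the posterior; this is one reason the labeling mechanism needs additional structure or assumptions.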

assumption, dataset, probability, (15 more...)

arXiv:2312.16557

Country:

Europe > Poland > Masovia Province > Warsaw (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > Poland > Kuyavian-Pomeranian Province > Toruń (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)


Double logistic regression approach to biased positive-unlabeled data

Furmańczyk, Konrad, Mielniczuk, Jan, Rejchel, Wojciech, Teisseyre, Paweł

arXiv.org Machine Learning, Oct-31-2023

Positive and unlabelled learning is an important problem that arises naturally in many applications. A significant limitation of almost all existing methods is the assumption that the propensity score function is constant (the selected completely at random, or SCAR, assumption), which is unrealistic in many practical situations. Avoiding this assumption, we consider a parametric approach to the problem of joint estimation of the posterior probability and propensity score functions. We show that, under mild assumptions, when both functions have the same parametric form (e.g. logistic with different parameters), the corresponding parameters are identifiable. Motivated by this, we propose two approaches to their estimation: a joint maximum likelihood method and an approach based on alternating maximization of two Fisher consistent expressions. Our experimental results show that the proposed methods are comparable to or better than the existing methods based on the Expectation-Maximisation scheme.
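The modelling assumptions named in this abstract can be written compactly. The notation below ($s$ for the label indicator, $y$ for the true class, $e(x)$ for the propensity score, $\beta$ and $\gamma$ for the two parameter vectors) is chosen here for illustration and is not taken from the paper.

```latex
\begin{align*}
P(s = 1 \mid x) &= P(y = 1 \mid x)\, P(s = 1 \mid y = 1, x) = y(x)\, e(x)
  && \text{(only positives can be labeled)} \\
e(x) &\equiv c
  && \text{(SCAR: constant propensity)} \\
y(x) &= \sigma(\beta^\top x), \quad e(x) = \sigma(\gamma^\top x), \quad
\sigma(t) = \frac{1}{1 + \exp(-t)}
  && \text{(same logistic form, different parameters)}
\end{align*}
```

Under SCAR the observed labeling probability is just a rescaled posterior, $c\,y(x)$, whereas in the instance-dependent case both factors vary with $x$, which is why identifiability of $\beta$ and $\gamma$ has to be established separately.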

artificial intelligence, estimation, machine learning, (17 more...)

arXiv:2209.07787

Country:

Europe > Poland > Masovia Province > Warsaw (0.04)
Europe > Poland > Kuyavian-Pomeranian Province > Toruń (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.82)

Industry: Health & Medicine > Therapeutic Area (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)


Wasserstein Random Forests and Applications in Heterogeneous Treatment Effects

Du, Qiming, Biau, Gérard, Petit, François, Porcher, Raphaël

arXiv.org Machine Learning, Oct-23-2020

We present new insights into causal inference in the context of Heterogeneous Treatment Effects by proposing natural variants of Random Forests to estimate the key conditional distributions. To achieve this, we recast Breiman's original splitting criterion in terms of Wasserstein distances between empirical measures. This reformulation indicates that Random Forests are well adapted to estimate conditional distributions and provides a natural extension of the algorithm to multivariate outputs. Following the philosophy of Breiman's construction, we propose some variants of the splitting rule that are well-suited to the conditional distribution estimation problem. Some preliminary theoretical connections are established along with various numerical experiments, which show how our approach may help to conduct more transparent causal inference in complex situations.

artificial intelligence, estimation, machine learning, (18 more...)

arXiv:2006.04709

Country:

Europe > France > Île-de-France > Paris > Paris (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.82)


Private Causal Inference using Propensity Scores

Lee, Si Kai, Gresele, Luigi, Park, Mijung, Muandet, Krikamol

arXiv.org Machine Learning, May-29-2019

The use of propensity score methods to reduce selection bias when determining causal effects is common practice for observational studies. Although such studies in econometrics, social science, and medicine often rely on sensitive data, there has been no prior work on privatising the propensity scores used to ascertain causal effects from observed data. In this paper, we demonstrate how to privatise the propensity score and quantify how the added noise for privatisation affects the propensity score as well as subsequent causal inference. We test our methods on both simulated and real-world datasets. The results are consistent with our theoretical findings that the privatisation preserves the validity of subsequent causal analysis with high probability. More importantly, our results empirically demonstrate that the proposed solutions are practical for moderately-sized datasets.

artificial intelligence, machine learning, propensity score, (16 more...)

arXiv:1905.12592

Country:

North America > United States (0.28)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)

Genre:

Research Report > Experimental Study (0.70)
Research Report > New Finding (0.49)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)

Technology:

Information Technology > Security & Privacy (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)


Learning from Positive and Unlabeled Data under the Selected At Random Assumption

Bekker, Jessa, Davis, Jesse

arXiv.org Machine Learning, Aug-27-2018

For many interesting tasks, such as medical diagnosis and web page classification, a learner only has access to some positively labeled examples and many unlabeled examples. Learning from this type of data requires making assumptions about the true distribution of the classes and/or the mechanism that was used to select the positive examples to be labeled. The commonly made assumptions, separability of the classes and positive examples being selected completely at random, are very strong. This paper proposes a weaker assumption that assumes the positive examples to be selected at random, conditioned on some of the attributes. To learn under this assumption, an EM method is proposed. Experiments show that our method is not only very capable of learning under this assumption, but it also outperforms the state of the art for learning under the selected completely at random assumption.

artificial intelligence, assumption, machine learning, (14 more...)

arXiv:1808.08755

Country:

Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.83)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.54)


Stacked Propensity Score Functions for Observational Cohorts with Oversampled Exposed Subjects

Rose, Sherri

arXiv.org Machine Learning, May-19-2018

Observational cohort studies with oversampled exposed subjects are typically implemented to understand the causal effect of a rare exposure. Because the distribution of exposed subjects in the sample differs from the source population, estimation of a propensity score function (i.e., probability of exposure given baseline covariates) targets a nonparametrically nonidentifiable parameter. Consistent estimation of propensity score functions is an important component of various causal inference estimators, including double robust machine learning and inverse probability weighted estimators. We propose the use of the probability of exposure from the source population in observation-weighted stacking algorithms to produce consistent estimators of propensity score functions. Simulation studies and a hypothetical health policy intervention data analysis demonstrate low empirical bias and variance for these stacked propensity score functions with observation weights.
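To make the role of observation weights concrete, here is a minimal sketch of fitting a single propensity model on a sample in which exposed subjects are oversampled. It assumes the source-population exposure probability `q0` is known and uses one plausible weighting, rescaling each exposure group so that the weighted exposure prevalence equals `q0`. The function names and simulation settings are hypothetical, the learner is a plain weighted logistic regression rather than a stacked ensemble, and the weights are not claimed to match the paper's exact definition.

```python
import numpy as np


def observation_weights(a, q0):
    """Weights that rescale the sample exposure prevalence to q0
    (q0 = exposure probability in the source population, assumed known)."""
    a = np.asarray(a, dtype=float)
    q_sample = a.mean()
    return np.where(a == 1, q0 / q_sample, (1 - q0) / (1 - q_sample))


def fit_weighted_logistic(X, a, w, n_iter=50):
    """Weighted logistic regression (intercept added) fitted by Newton's method."""
    Xb = np.column_stack([np.ones(len(a)), X])
    beta = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xb @ beta))
        grad = Xb.T @ (w * (a - p))                      # weighted score
        hess = (Xb * (w * p * (1 - p))[:, None]).T @ Xb  # weighted information
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def simulate_oversampled_cohort(n_source=200_000, n_per_group=2_000, seed=0):
    """Rare exposure in the source population; equal numbers of exposed
    and unexposed subjects are drawn into the study sample."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n_source)
    a = rng.binomial(1, 1.0 / (1.0 + np.exp(-(-3.0 + x))))  # true logit: -3 + x
    q0 = a.mean()
    exposed = rng.choice(np.flatnonzero(a == 1), n_per_group, replace=False)
    unexposed = rng.choice(np.flatnonzero(a == 0), n_per_group, replace=False)
    idx = np.concatenate([exposed, unexposed])
    return x[idx, None], a[idx].astype(float), q0


X, a, q0 = simulate_oversampled_cohort()
w = observation_weights(a, q0)
beta_weighted = fit_weighted_logistic(X, a, w)
beta_unweighted = fit_weighted_logistic(X, a, np.ones_like(a))
print("source-population exposure probability:", q0)
print("weighted fit (intercept, slope):  ", beta_weighted)
print("unweighted fit (intercept, slope):", beta_unweighted)
```

In this simulation the unweighted fit targets the exposure distribution of the oversampled study sample, while the weighted fit targets the source population, which is the distinction the abstract draws.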

artificial intelligence, machine learning, propensity score function, (17 more...)

arXiv:1805.07684

Country: North America > United States > Massachusetts (0.28)

Genre:

Research Report > Experimental Study (0.96)
Research Report > Strength Medium (0.70)

Industry:

Health & Medicine > Government Relations & Public Policy (1.00)
Health & Medicine > Health Care Providers & Services > Reimbursement (0.96)
Government > Regional Government > North America Government > United States Government (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
